Search results for "Lexical database"
Sub-symbolic Encoding of Words
2003
A new methodology for sub-symbolic semantic encoding of words is presented. The methodology uses the WordNet lexical database and an ad hoc modified Sammon algorithm to associate a vector with each word in a semantic n-space. All words have been grouped according to the classification criteria of the WordNet lexicographers' files; these groups are called lexical sets. The word vector is composed of two parts: the first accounts for the word's membership in one of these lexical sets; the second relates to the meaning of the word and distinguishes it from the other words in the same lexical set. The application of the proposed technique over all…
Wordnet and semidiscrete decomposition for sub-symbolic representation of words
2009
A methodology for sub-symbolic semantic encoding of words is presented. The methodology uses the standard, semantically highly-structured WordNet lexical database and the SemiDiscrete matrix Decomposition to obtain a vector representation with low memory requirements in a semantic n-space. The application of the proposed algorithm over all the WordNet words would lead to a useful tool for the sub-symbolic processing of texts.
LEXOP: a lexical database providing orthography-phonology statistics for French monosyllabic words.
1999
During the last 20 years, psycholinguistic research has identified many variables that influence reading and spelling processes. We describe a new computerized lexical database, LEXOP, which provides quantitative descriptors about the relations between orthography and phonology for French monosyllabic words. Three main classes of variables are considered: consistency of print-to-sound and sound-to-print associations, frequency of orthography-phonology correspondences, and word neighborhood characteristics.
On the advantages of word-frequency and contextual diversity measures extracted from subtitles: the case of Portuguese
2015
Accepted manuscript. Epub ahead of print, 29 Sep. 2014.
How is a sign language corpus created and what is it needed for? Sign language corpora and their functions
2020
The article discusses the creation of corpora of Finnish Sign Language and Finland-Swedish Sign Language in the CFINSL project (Corpus project of Finland’s sign languages, Suomen viittomakielten korpusprojekti). Sign languages have no written form, so compiling corpora for them requires a different approach from compiling corpora for spoken languages that have a written form. The article describes the methods by which the Sign Language Centre at the University of Jyväskylä has collected material for the corpus of Finnish and Finland-Swedish Sign Language. It also describes the technical processing of the corpus material, annotation, the collection and processing of metadata, the storage of the material, and researchers' use…
GreekLex 2: A comprehensive lexical database with part-of-speech, syllabic, phonological, and stress information
2017
Databases containing lexical properties for any given orthography are crucial for psycholinguistic research. In the last ten years, a number of lexical databases have been developed for Greek. However, these lack important part-of-speech information. Furthermore, the need for alternative procedures for calculating syllabic measurements and stress information, as well as for combining several metrics to investigate linguistic properties of the Greek language, is highlighted. To address these issues, we present a new extensive lexical database of Modern Greek (GreekLex 2) with part-of-speech information for each word and accurate syllabification and orthographic information predictive of stre…